AITopics | goal description

Collaborating Authors

goal description

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MLFM: Multi-Layered Feature Maps for Richer Language Understanding in Zero-Shot Semantic Navigation

Raychaudhuri, Sonia, Cancelli, Enrico, Campari, Tommaso, Ballan, Lamberto, Savva, Manolis, Chang, Angel X.

arXiv.org Artificial IntelligenceOct-20-2025

Recent progress in large vision-language models has driven improvements in language-based semantic navigation, where an embodied agent must reach a target object described in natural language. Y et we still lack a clear, language-focused evaluation framework to test how well agents ground the words in their instructions. We address this gap by proposing LangNav, an open-vocabulary multi-object navigation dataset with natural language goal descriptions (e.g. 'go to the red short candle on the table') and corresponding fine-grained linguistic annotations (e.g., attributes: color=red, size=short; relations: support=on). These labels enable systematic evaluation of language understanding. To evaluate on this setting, we extend multi-object navigation task setting to Language-guided Multi-Object Navigation (LaMoN), where the agent must find a sequence of goals specified using language. Furthermore, we propose Multi-Layered Feature Map (MLFM), a novel method that builds a queryable, multi-layered semantic map from pretrained vision-language features and proves effective for reasoning over fine-grained attributes and spatial relations in goal descriptions. Experiments on LangNav show that MLFM outperforms state-of-the-art zero-shot mapping-based navigation baselines.

large language model, natural language, relation, (17 more...)

arXiv.org Artificial Intelligence

2507.07299

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > Alberta (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning

Klissarov, Martin, Bagaria, Akhil, Luo, Ziyan, Konidaris, George, Precup, Doina, Machado, Marlos C.

arXiv.org Artificial IntelligenceJun-18-2025

Developing agents capable of exploring, planning and learning in complex open-ended environments is a grand challenge in artificial intelligence (AI). Hierarchical reinforcement learning (HRL) offers a promising solution to this challenge by discovering and exploiting the temporal structure within a stream of experience. The strong appeal of the HRL framework has led to a rich and diverse body of literature attempting to discover a useful structure. However, it is still not clear how one might define what constitutes good structure in the first place, or the kind of problems in which identifying it may be helpful. This work aims to identify the benefits of HRL from the perspective of the fundamental challenges in decision-making, as well as highlight its impact on the performance trade-offs of AI agents. Through these benefits, we then cover the families of methods that discover temporal structure in HRL, ranging from learning directly from online experience to offline datasets, to leveraging large language models (LLMs). Finally, we highlight the challenges of temporal structure discovery and the domains that are particularly well-suited for such endeavours.

large language model, machine learning, reinforcement learning, (23 more...)

arXiv.org Artificial Intelligence

2506.14045

Country:

North America > Canada > Alberta (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > Texas (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.47)

Industry:

Health & Medicine (1.00)
Education (1.00)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(8 more...)

Add feedback

Infer Human's Intentions Before Following Natural Language Instructions

Wan, Yanming, Wu, Yue, Wang, Yiping, Mao, Jiayuan, Jaques, Natasha

arXiv.org Artificial IntelligenceSep-26-2024

For AI agents to be helpful to humans, they should be able to follow natural language instructions to complete everyday cooperative tasks in human environments. However, real human instructions inherently possess ambiguity, because the human speakers assume sufficient prior knowledge about their hidden goals and intentions. Standard language grounding and planning methods fail to address such ambiguities because they do not model human internal goals as additional partially observable factors in the environment. We propose a new framework, Follow Instructions with Social and Embodied Reasoning (FISER), aiming for better natural language instruction following in collaborative embodied tasks. Our framework makes explicit inferences about human goals and intentions as intermediate reasoning steps. We implement a set of Transformer-based models and evaluate them over a challenging benchmark, HandMeThat. We empirically demonstrate that using social reasoning to explicitly infer human intentions before making action plans surpasses purely end-to-end approaches. We also compare our implementation with strong baselines, including Chain of Thought prompting on the largest available pre-trained language models, and find that FISER provides better performance on the embodied social reasoning tasks under investigation, reaching the state-of-the-art on HandMeThat.

instruction, reasoning, robot, (15 more...)

arXiv.org Artificial Intelligence

2409.18073

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

MUTEX: Learning Unified Policies from Multimodal Task Specifications

Shah, Rutav, Martín-Martín, Roberto, Zhu, Yuke

arXiv.org Artificial IntelligenceSep-25-2023

Humans use different modalities, such as speech, text, images, videos, etc., to communicate their intent and goals with teammates. For robots to become better assistants, we aim to endow them with the ability to follow instructions and understand tasks specified by their human partners. Most robotic policy learning methods have focused on one single modality of task specification while ignoring the rich cross-modal information. We present MUTEX, a unified approach to policy learning from multimodal task specifications. It trains a transformer-based architecture to facilitate cross-modal reasoning, combining masked modeling and cross-modal matching objectives in a two-stage training procedure. After training, MUTEX can follow a task specification in any of the six learned modalities (video demonstrations, goal images, text goal descriptions, text instructions, speech goal descriptions, and speech instructions) or a combination of them. We systematically evaluate the benefits of MUTEX in a newly designed dataset with 100 tasks in simulation and 50 tasks in the real world, annotated with multiple instances of task specifications in different modalities, and observe improved performance over methods trained specifically for any single modality. More information at https://ut-austin-rpl.github.io/MUTEX/

modality, specification, task specification, (14 more...)

arXiv.org Artificial Intelligence

2309.1432

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Add feedback

Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis

Kirk, James R., Wray, Robert E., Lindes, Peter

arXiv.org Artificial IntelligenceAug-22-2023

Large language models (LLMs) offer significant promise as a knowledge source for task learning. Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks. We describe a cognitive-agent approach that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences. The approach is to increase the response space of LLMs and deploy general strategies, embedded within the autonomous agent, to evaluate, repair, and select among candidate responses produced by the LLM. We describe the approach and experiments that show how an agent, by retrieving and evaluating a breadth of responses from the LLM, can achieve 77-94% task completion in one-shot learning without user oversight. The approach achieves 100% task completion when human oversight (such as an indication of preference) is provided. Further, the type of oversight largely shifts from explicit, natural language instruction to simple confirmation/discomfirmation of high-quality responses that have been vetted by the agent before presentation to a user.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2306.0677

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Government (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Task-Driven Graph Attention for Hierarchical Relational Object Navigation

Lingelbach, Michael, Li, Chengshu, Hwang, Minjune, Kurenkov, Andrey, Lou, Alan, Martín-Martín, Roberto, Zhang, Ruohan, Fei-Fei, Li, Wu, Jiajun

arXiv.org Artificial IntelligenceJun-23-2023

Embodied AI agents in large scenes often need to navigate to find objects. In this work, we study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON), where the goal is to find objects specified by logical predicates organized in a hierarchical structure - objects related to furniture and then to rooms - such as finding an apple on top of a table in the kitchen. Solving such a task requires an efficient representation to reason about object relations and correlate the relations in the environment and in the task goal. HRON in large scenes (e.g. homes) is particularly challenging due to its partial observability and long horizon, which invites solutions that can compactly store the past information while effectively exploring the scene. We demonstrate experimentally that scene graphs are the best-suited representation compared to conventional representations such as images or 2D maps. We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone, with an integrated task-driven attention mechanism, and demonstrate its better scalability and learning efficiency than state-of-the-art baselines.

artificial intelligence, machine learning, navigation, (15 more...)

arXiv.org Artificial Intelligence

2306.1376

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning When and What to Ask: a Hierarchical Reinforcement Learning Framework

Nguyen, Khanh, Bisk, Yonatan, Daumé, Hal III

arXiv.org Artificial IntelligenceOct-13-2021

Reliable AI agents should be mindful of the limits of their knowledge and consult humans when sensing that they do not have sufficient knowledge to make sound decisions. We formulate a hierarchical reinforcement learning framework for learning to decide when to request additional information from humans and what type of information would be helpful to request. Our framework extends partially-observed Markov decision processes (POMDPs) by allowing an agent to interact with an assistant to leverage their knowledge in accomplishing tasks. Results on a simulated human-assisted navigation problem demonstrate the effectiveness of our framework: aided with an interaction policy learned by our method, a navigation policy achieves up to a 7x improvement in task success rate compared to performing tasks only by itself. The interaction policy is also efficient: on average, only a quarter of all actions taken during a task execution are requests for information. We analyze benefits and challenges of learning with a hierarchical policy structure and suggest directions for future work.

agent, information, interaction policy, (16 more...)

arXiv.org Artificial Intelligence

2110.08258

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Maryland (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.55)

Add feedback

Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives

Goyal, Anirudh, Sodhani, Shagun, Binas, Jonathan, Peng, Xue Bin, Levine, Sergey, Bengio, Yoshua

arXiv.org Artificial IntelligenceJun-25-2019

Reinforcement learning agents that operate in diverse and complex environments can benefit from the structured decomposition of their behavior. Often, this is addressed in the context of hierarchical reinforcement learning, where the aim is to decompose a policy into lower-level primitives or options, and a higher-level meta-policy that triggers the appropriate behaviors for a given situation. However, the meta-policy must still produce appropriate decisions in all states. In this work, we propose a policy design that decomposes into primitives, similarly to hierarchical reinforcement learning, but without a high-level meta-policy. Instead, each primitive can decide for themselves whether they wish to act in the current state. We use an information-theoretic mechanism for enabling this decentralized decision: each primitive chooses how much information it needs about the current state to make a decision and the primitive that requests the most information about the current state acts in the world. The primitives are regularized to use as little information as possible, which leads to natural competition and specialization. We experimentally demonstrate that this policy architecture improves over both flat and hierarchical policies in terms of generalization.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

1906.10667

Country:

North America > Canada (0.46)
North America > United States (0.46)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Automated Transformation of PDDL Representations

Riddle, Patricia J. (University of Auckland) | Barley, Michael W (University of Auckland) | Franco, Santiago (University of Auckland) | Douglas, Jordan (University of Auckland)

AAAI ConferencesMay-21-2015

This paper describes a system that automatically transforms a PDDL encoding, calls a planner to solve the transformed representation, and translates the solution back into the original representation. The approach involves counting objects that are indistinguishable, rather than treating them as individuals, which eliminates some unnecessary combinatorial explosion.

goal description, predicate, representation, (14 more...)

AAAI Conferences

Eighth Annual Symposium on Combinatorial Search

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.06)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
Asia > Middle East > Jordan (0.05)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.76)

Add feedback

Goal Recognition through Goal Graph Analysis

Hong, J.

Journal of Artificial Intelligence ResearchJul-1-2001

We present a novel approach to goal recognition based on a two-stage paradigm of graph construction and analysis. First, a graph structure called a Goal Graph is constructed to represent the observed actions, the state of the world, and the achieved goals as well as various connections between these nodes at consecutive time steps. Then, the Goal Graph is analysed at each time step to recognise those partially or fully achieved goals that are consistent with the actions observed so far. The Goal Graph analysis also reveals valid plans for the recognised goals or part of these goals. Our approach to goal recognition does not need a plan library. It does not suffer from the problems in the acquisition and hand-coding of large plan libraries, neither does it have the problems in searching the plan space of exponential size. We describe two algorithms for Goal Graph construction and analysis in this paradigm. These algorithms are both provably sound, polynomial-time, and polynomial-space. The number of goals recognised by our algorithms is usually very small after a sequence of observed actions has been processed. Thus the sequence of observed actions is well explained by the recognised goals with little ambiguity. We have evaluated these algorithms in the UNIX domain, in which excellent performance has been achieved in terms of accuracy, efficiency, and scalability.

algorithm, goal graph, time step, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.830

AI Access Foundation

10280

Journal of Artificial Intelligence Research

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > United Kingdom > Northern Ireland > County Antrim > Newtownabbey (0.04)

Genre:

Research Report (0.66)
Overview (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)

Add feedback